Detecting Application-Level Failures in Component-based Internet Services

Author

  • Emre Kıcıman
Abstract

Pinpoint is an application-generic framework that uses statistical learning techniques to detect and localize likely application-level failures in component-based Internet services. Assuming that most of the system is working most of the time, Pinpoint looks for anomalies in low-level behaviors that are likely to reflect high-level application faults, and correlates these anomalies with their potential causes within the system. In our experiments, Pinpoint correctly detected and localized 70-88% of the faults we injected into our testbed system, depending on the type of fault, compared with the 50-70% detected by current techniques. By demonstrating the applicability of statistical learning, and by providing an application-generic platform on which additional machine learning techniques can be applied to the problem of fast failure detection, we hope to hasten the adoption of statistical approaches to dependability for complex software systems.
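
The idea of treating the majority of the system as "normal" and flagging whatever deviates from it can be illustrated with a minimal sketch. This is not Pinpoint's algorithm, which the abstract does not specify. It swaps in a generic robust outlier test (the modified z-score based on the median absolute deviation), and the component names, the error-rate metric, and the `flag_anomalies` helper are all hypothetical.

```python
from statistics import median


def flag_anomalies(metric_by_component, threshold=3.5):
    """Return components whose metric deviates strongly from the peer median.

    Hypothetical helper: uses the modified z-score (median absolute
    deviation), a generic outlier test, not Pinpoint's actual method.
    """
    values = list(metric_by_component.values())
    center = median(values)
    # Median absolute deviation: stays stable even if a few peers are faulty.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the peers sit exactly on the median; anything else stands out.
        return [name for name, v in metric_by_component.items() if v != center]
    return [
        name
        for name, v in metric_by_component.items()
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Made-up data: fraction of requests on which each component raised an error.
error_rates = {
    "cart": 0.010,
    "catalog": 0.012,
    "login": 0.011,
    "checkout": 0.240,
    "search": 0.009,
}
suspects = flag_anomalies(error_rates)
print(suspects)
```

Because the median and the median absolute deviation are barely affected by a small number of bad values, the test depends on the same assumption the abstract states: most of the system is working most of the time.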

Similar resources

JAGR: An Autonomous Self-Recovering Application Server

This paper demonstrates that the dependability of generic, evolving J2EE applications can be enhanced through a combination of a few recovery-oriented techniques. Our goal is to reduce downtime by automatically and efficiently recovering from a broad class of transient software failures without having to modify applications. We describe here the integration of three new techniques into JBoss, a...

Diagnosing Internet Failures Using End-to-End Measurements and Routing Data

The scale and the distributed nature of the Internet make it difficult for Internet Service Providers and end-users alike to identify the causes of failures that affect their networking services. Among the variety of problems that can occur, failures on the IP forwarding path are the hardest to troubleshoot. Under these circumstances, a number of network failure diagnosis techniques have emerge...

Detecting Fake Websites Using Swarm Intelligence Mechanism in Human Learning

The internet and its various services have made it easy for users to communicate with each other. The internet's benefits include online business and e-commerce. E-commerce has boosted online sales and online auctions. Despite their many uses and benefits, the internet and its services face various challenges, such as information theft, which undermines the use of these services. Information thef...

Using Statistical Monitoring to Detect Failures in Internet Services

Since the Internet’s popular emergence in the mid-1990’s, Internet services such as e-mail and messaging systems, search engines, e-commerce, news and financial sites, have become an important and often mission-critical part of our society. Unfortunately, managing these systems and keeping them running is a significant challenge. Their rapid rate of change as well as their size and complexity m...

Recovering Internet Service Sessions from Operating System Failures Motivation and Approach

Critical Internet services such as e-commerce, online auctions, and banking run on complex, multi-tier architectures built with commodity (off-the-shelf) machines and operating systems. These stateful services are sensitive to server failures: active client sessions on these servers are lost, although the state associated with them might still be intact in a failed machine’s memory. We developed ...

Publication date: 2004